Beyond Class A: A Proposal for Automatic Evaluation of Discourse

Authors

  • Lynette Hirschman
  • Deborah A. Dahl
  • Donald P. McKay
  • Lewis M. Norton
  • Marcia C. Linebarger
Abstract

Introduction

The DARPA Spoken Language community has just completed the first trial evaluation of spontaneous query/response pairs in the Air Travel (ATIS) domain. Our goal has been to find a methodology for evaluating correct responses to user queries. To this end, we agreed, for the first trial evaluation, to constrain the problem in several ways:

  • Database Application: Constrain the application to a database query application, to ease the burden of a) constructing the back-end, and b) determining correct responses;
  • Canonical Answer: Constrain answer comparison to a minimal "canonical answer" that imposes the fewest constraints on the form of system response displayed to a user at each site;
  • Typed Input: Constrain the evaluation to typed input only;
  • Class A: Constrain the test set to single unambiguous intelligible utterances taken without context that have well-defined database answers ("class A" sentences).

These were reasonable constraints to impose on the first trial evaluation. However, it is clear that we need to loosen these constraints to obtain a more realistic evaluation of spoken language systems. The purpose of this paper is to suggest how we can move beyond evaluation of class A sentences to an evaluation of connected dialogue, including out-of-domain queries.

Publication date: 1990